Skip to content

integrate hashlogger#3647

Merged
cody-littley merged 6 commits into
mainfrom
cjl/hashlog-integration
Jun 30, 2026
Merged

integrate hashlogger#3647
cody-littley merged 6 commits into
mainfrom
cjl/hashlog-integration

Conversation

@cody-littley

Copy link
Copy Markdown
Contributor

Describe your changes and provide context

Wire in the hashlogger to report hashes for lots of different things

@github-actions

github-actions Bot commented Jun 25, 2026

Copy link
Copy Markdown

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedJun 30, 2026, 6:21 PM

@codecov

codecov Bot commented Jun 25, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 66.04478% with 91 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.64%. Comparing base (2378fca) to head (1153f77).

Files with missing lines Patch % Lines
sei-cosmos/storev2/rootmulti/hashlog.go 66.26% 18 Missing and 10 partials ⚠️
sei-db/state_db/sc/composite/hashlog.go 0.00% 19 Missing ⚠️
sei-db/state_db/sc/hashlog/hash_logger_impl.go 75.36% 10 Missing and 7 partials ⚠️
app/seidb.go 25.00% 5 Missing and 4 partials ⚠️
sei-db/state_db/sc/memiavl/hashlog.go 72.72% 3 Missing and 3 partials ⚠️
sei-cosmos/baseapp/abci.go 66.66% 2 Missing and 2 partials ⚠️
sei-db/state_db/sc/flatkv/hashlog.go 77.77% 2 Missing and 2 partials ⚠️
sei-db/state_db/sc/hashlog/noop_hash_logger.go 0.00% 4 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3647      +/-   ##
==========================================
- Coverage   58.97%   58.64%   -0.34%     
==========================================
  Files        2263     2226      -37     
  Lines      187223   184451    -2772     
==========================================
- Hits       110421   108171    -2250     
+ Misses      66858    66508     -350     
+ Partials     9944     9772     -172     
Flag Coverage Δ
sei-chain-pr 57.70% <69.85%> (?)
sei-db 70.41% <ø> (ø)
sei-db-state-db ?
sei-db-state-db-pr 74.73% <62.12%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
app/test_helpers.go 67.01% <100.00%> (+0.10%) ⬆️
sei-cosmos/baseapp/baseapp.go 75.58% <ø> (ø)
sei-cosmos/storev2/rootmulti/store.go 68.66% <100.00%> (+0.63%) ⬆️
sei-db/config/hashlog_config.go 100.00% <100.00%> (ø)
sei-db/config/sc_config.go 88.23% <100.00%> (+0.73%) ⬆️
sei-db/state_db/sc/hashlog/hash_logger_config.go 94.11% <ø> (-0.33%) ⬇️
sei-cosmos/baseapp/abci.go 64.28% <66.66%> (+0.04%) ⬆️
sei-db/state_db/sc/flatkv/hashlog.go 77.77% <77.77%> (ø)
sei-db/state_db/sc/hashlog/noop_hash_logger.go 63.63% <0.00%> (-36.37%) ⬇️
sei-db/state_db/sc/memiavl/hashlog.go 72.72% <72.72%> (ø)
... and 4 more

... and 44 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

done := make(chan struct{})
select {
case h.controlChan <- controlMessage{kind: ctrlColumnChange, hashType: hashType, add: add, done: done}:
select {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sendColumnChange returns nil when the context is cancelled or the sender context is done:
Consider returning a sentinel error or at least h.ctx.Err() / h.senderCtx.Err() so the caller knows the registration didn't actually land.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

function now returns an error if context is cancelled

Comment thread sei-cosmos/storev2/rootmulti/hashlog.go Outdated
}

// containsString reports whether s is present in xs.
func containsString(xs []string, s string) bool {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The containsString function can be replaced with slices.Contains

@cody-littley cody-littley Jun 30, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converted to set membership, no longer required

Comment thread sei-cosmos/storev2/rootmulti/hashlog.go Outdated
}
}
}
for _, category := range desired {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A set-based approach would be clearer?

desiredSet := make(map[string]struct{}, len(desired))
for _, c := range desired { desiredSet[c] = struct{}{} }

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change made

Comment thread sei-cosmos/baseapp/abci.go Outdated
// stores as the next block's header.LastResultsHash; logging it per block surfaces gas/result
// divergence (e.g. between executors) independently of the state AppHash. Only computed when the
// commit store records hashes, and never fails the block on a marshal error.
if _, ok := app.cms.(interface{ SetNextResultHash([]byte) }); ok {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even sc-hash-logger-enable = false, interface{ SetNextResultHash([]byte) } is always true?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@cody-littley cody-littley marked this pull request as ready for review June 30, 2026 17:56
@cursor

cursor Bot commented Jun 30, 2026

Copy link
Copy Markdown

PR Summary

Medium Risk
Changes the commit path and ABCI FinalizeBlock/Commit flow, but hash logging failures are non-fatal and skipped when disabled; default-on adds background I/O and disk use on production nodes.

Overview
Adds on-by-default per-block hash logging so operators can compare named hashes across nodes (app hash, block hash, result hash, changeset, memIAVL module/root, flatKV root/DB).

Configuration flows through new sc-hash-logger-* flags in app.toml / parseSCConfigs, with build version stamped into log file names; tests disable logging via FlagSCHashLoggerEnable.

baseapp passes the block header hash and a merkle result hash (from finalized tx results, only when HashLoggingEnabled) into the commit store before Commit.

rootmulti owns orchestration: lazy logger open, dynamic column sync as backends change during migration, changeset capture on the first flush per block, recordBlockHashes on commit, and graceful disable on logger errors without failing commit.

memIAVL, flatKV, and composite expose HashCategories / RecordHashes; the hashlog package gains runtime RegisterHashType / UnregisterHashType with file rotation, / in column names, and optional zero retention limits.

Reviewed by Cursor Bugbot for commit 1153f77. Bugbot is set up for automated code reviews on this repo. Configure here.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 92c54f3. Configure here.

}
}
}
rs.hashCategories = desired

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failed hash category registration skipped

Medium Severity

In syncHashCategories, when RegisterHashType fails, the error is only logged but rs.hashCategories is still replaced with the full desired set. Later commits treat that category as already synced and never call RegisterHashType again, so ReportHash keeps failing with an unknown type and that column stays missing from the hash log until restart.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 92c54f3. Configure here.

@seidroid seidroid Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR wires the per-block hash logger into rootmulti's commit path (app hash, block hash, result hash, memIAVL/flatKV per-store hashes, changeset), adds runtime column register/unregister with file rotation, and makes retention limits 0-disable-able. The implementation is well-tested and the integration points degrade safely; no blocking correctness issues were found, only a config-consistency note and an ops/performance consideration about the default-on feature sitting in the consensus commit path.

Findings: 0 blocking | 4 non-blocking | 0 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • (Codex) sei-cosmos/server/config/config.go GetConfig() builds a fresh StateCommitConfig literal without HashLogger, so config.GetConfig(...).StateCommit.HashLogger is zero-valued (Enable=false, all retention=0). This file is not in the PR diff so it can't be anchored inline. Verified it is currently harmless: the live hash-logger store is built from app.parseSCConfigs (app/seidb.go:80,102), and app.toml is rendered from customAppConfig/serverconfig.DefaultConfig() (cmd/seid/cmd/root.go:384,415), not from GetConfig. Still worth seeding StateCommit from config.DefaultStateCommitConfig() (or parsing the sc-hash-logger-* keys) in GetConfig to avoid a latent footgun: if any tooling ever renders app.toml from a GetConfig-derived Config, it would emit sc-hash-logger-enable = false and on the next load silently disable hash logging.
  • Performance/ops: hash logging is enabled by default and recordBlockHashes runs inside rootmulti.Store.Commit while holding rs.mtx (the consensus commit hot path). ReportHash/ReportChangeset/RegisterHashType do blocking sends on the logger's buffered control/writer channels, so sustained writer backpressure (slow or near-full disk) could add latency to block commits. It is not a deadlock risk (a fatal logger error cancels senderCtx so sends abort, and the store then disables logging), but the team should confirm the default buffer sizes absorb expected bursts before shipping this on by default.
  • cursor-review.md is empty — the Cursor second-opinion pass produced no output. REVIEW_GUIDELINES.md is also empty, so no repo-specific review standards were applied.
  • Minor: in rootmulti.flush, the captured rs.blockChangeSets aliases the same changeSets slice that is also handed to ssStore.ApplyChangesetAsync, and it is later consumed asynchronously by the hash logger's hasher goroutine via ReportChangeset. This is safe only if both async consumers treat the changeset as read-only; worth a confirming comment since both run concurrently.

@seidroid seidroid Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wires a per-block hash logger into the commit path (block/result/app/memIAVL/flatKV hashes) with dynamic column registration and retention; the implementation is well-tested and disabled in the test harness. One real but edge-case correctness issue at column-set changes (flagged by Codex) and a couple of default-on hot-path considerations are worth noting, none blocking for a debug/forensics tool.

Findings: 0 blocking | 5 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Column-change vs. in-flight changeset race (agrees with Codex): rootmulti's syncHashCategories() runs at the start of each block's recordBlockHashes, before that block's ReportChangeset. If the category set changes (module add/remove at an upgrade, or memIAVL→flatKV at migration) while the previous block's changeset hash is still being computed asynchronously by the hasher, handleColumnChange only drainComplete()s blocks that are already complete. The prior block is incomplete (its changeset hasn't returned), so the column set is mutated underneath it. Completeness is then judged by count against the new hashTypes: if a column was net-added the prior block can never complete (dropped on clean shutdown with an Info log, or force-flushed partial under buffer pressure); if a column was net-removed it can be emitted with the changeset still missing. This is confined to rare migration/upgrade boundaries and to a non-consensus debugging tool, so it is not blocking, but it undercuts the tool's forensic accuracy at exactly the boundaries operators care about. Consider draining in-flight changesets (or otherwise quiescing pending blocks) before applying a column change, or keying completeness on the per-block expected column set rather than a live count.
  • Hash logging is enabled by default and now sits inline in the consensus-critical Commit path (recordBlockHashes is called under rs.mtx). ReportHash/ReportChangeset do blocking sends to the control loop, which backpressures through the buffered writer channel. Buffers absorb transient stalls, but a sustained slow/near-full disk could backpressure block commit. Worth confirming this risk is acceptable for a default-on feature, or documenting the operator escape hatch (sc-hash-logger-enable=false).
  • With the default BlocksToRetain=0 and an operator-set MaxDiskSize=0, both retention dimensions are disabled and the hash log grows without bound. This is documented as a deliberate choice and the default MaxDiskSize (16 GiB) keeps the out-of-box config bounded, so this is just a heads-up.
  • The Cursor second-opinion review file (cursor-review.md) was empty — that pass produced no output.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

}

// Flush everything that is complete under the current columns to the current file.
h.drainComplete()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[suggestion] handleColumnChange only flushes blocks that are already complete (drainComplete) before mutating hashTypes. A block whose changeset hash is still in flight on the hasher thread is not complete here, so the column set changes underneath it. Since drainComplete/flushCompleteOnShutdown judge completeness by len(Hashes) < len(h.hashTypes) (a count, not a column-name match), once a column is net-added that in-flight block can never reach the new count and is dropped on clean shutdown (or force-flushed partial under buffer pressure); a net-removed column can let it emit with the changeset still missing. In this PR's caller (rootmulti.syncHashCategories runs before the block's ReportChangeset), this can trigger at module add/remove or memIAVL→flatKV migration boundaries while the prior block's changeset is pending. Consider draining in-flight changesets before applying the column change, or tracking each pending block's expected column set rather than relying on a live count.

@cody-littley cody-littley added this pull request to the merge queue Jun 30, 2026
Merged via the queue into main with commit acc4c3c Jun 30, 2026
67 checks passed
@cody-littley cody-littley deleted the cjl/hashlog-integration branch June 30, 2026 19:18
yzang2019 added a commit that referenced this pull request Jun 30, 2026
* main:
  feat(seid): ConfigManager selection seam (PLT-775 PR1) (#3671)
  fix(evmrpc): limit listener max open connections, configurable via max_open_connections (PLT-704) (#3637)
  LittDB: Keymap threading improvements (#3645)
  integrate hashlogger (#3647)
  fix(metrics): Prometheus metrics output (#3640)
  [codex] Harden multiversion iterator validation (#3656)
  feat(consensus): mock_chain_validation replay build + memIAVL state-sync restore fixes (#3663)
  chore: replace OLD red SeiLogo banner in README with new 2026 Sei lockup (#3670)
  Require absolute path for evmone lib (#3668)
  fix(evmrpc): apply getLogs maxLog cap during merge instead of after (PLT-687) (#3666)
  feat(evmrpc): pre-decode request size admission control (PLT-295) (#3648)
  Make autobahn block production check wait for progress (#3667)
  fix(sei-tendermint): prevent readRoutine goroutine leak on /websocket when writeChan is full (PLT-707) (#3664)
  Per-block littidx flush + single shard (gated on #3645) (#3660)
  fix(evmrpc): bound debug_traceStateAccess memory and add trace admission control (PLT-360) (#3653)
  [codex] bump go-ethereum to v1.15.7-sei-17 (#3657)
  Upodate checkout GHA step across all workflows (#3659)
  Add GoReleaser release pipeline for static seid binaries (#3425)
  Parallelize littidx eth_getLogs across blocks (#3652)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants